AITopics | Bình Định Province

Collaborating Authors

Bình Định Province

Towards Cultural Bridge by Bahnaric-Vietnamese Translation Using Transfer Learning of Sequence-To-Sequence Pre-training Language Model

Dat, Phan Tran Minh, Khang, Vo Hoang Nhat, Tho, Quan Thanh

arXiv.org Artificial IntelligenceMay-19-2025

This work explores the journey towards achieving Bahnaric-Vietnamese translation for the sake of culturally bridging the two ethnic groups in Vietnam. However, translating from Bahnaric to Vietnamese also encounters some difficulties. The most prominent challenge is the lack of available original Bahnaric resources source language, including vocabulary, grammar, dialogue patterns and bilingual corpus, which hinders the data collection process for training. To address this, we leverage a transfer learning approach using sequence-to-sequence pre-training language model. First of all, we leverage a pre-trained Vietnamese language model to capture the characteristics of this language. Especially, to further serve the purpose of machine translation, we aim for a sequence-to-sequence model, not encoder-only like BERT or decoder-only like GPT. Taking advantage of significant similarity between the two languages, we continue training the model with the currently limited bilingual resources of Vietnamese-Bahnaric text to perform the transfer learning from language model to machine translation. Thus, this approach can help to handle the problem of imbalanced resources between two languages, while also optimizing the training and computational processes. Additionally, we also enhanced the datasets using data augmentation to generate additional resources and defined some heuristic methods to help the translation more precise. Our approach has been validated to be highly effective for the Bahnaric-Vietnamese translation model, contributing to the expansion and preservation of languages, and facilitating better mutual understanding between the two ethnic people.

machine learning, natural language, translation, (18 more...)

arXiv.org Artificial Intelligence

2505.11421

Country:

Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.06)
Asia > Vietnam > Bình Định Province (0.05)
Asia > Vietnam > Kon Tum Province > Kon Tum (0.05)
Asia > Vietnam > Gia Lai Province (0.05)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multi-Dialect Vietnamese: Task, Dataset, Baseline Models and Challenges

Van Dinh, Nguyen, Dang, Thanh Chi, Nguyen, Luan Thanh, Van Nguyen, Kiet

arXiv.org Artificial IntelligenceOct-4-2024

Vietnamese, a low-resource language, is typically categorized into three primary dialect groups that belong to Northern, Central, and Southern Vietnam. However, each province within these regions exhibits its own distinct pronunciation variations. Despite the existence of various speech recognition datasets, none of them has provided a fine-grained classification of the 63 dialects specific to individual provinces of Vietnam. To address this gap, we introduce Vietnamese Multi-Dialect (ViMD) dataset, a novel comprehensive dataset capturing the rich diversity of 63 provincial dialects spoken across Vietnam. Our dataset comprises 102.56 hours of audio, consisting of approximately 19,000 utterances, and the associated transcripts contain over 1.2 million words. To provide benchmarks and simultaneously demonstrate the challenges of our dataset, we fine-tune state-of-the-art pre-trained models for two downstream tasks: (1) Dialect identification and (2) Speech recognition. The empirical results suggest two implications including the influence of geographical factors on dialects, and the constraints of current approaches in speech recognition tasks involving multi-dialect speech data. Our dataset is available for research purposes.

dataset, dialect, experiment, (17 more...)

arXiv.org Artificial Intelligence

2410.03458

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
Asia > Vietnam > Thanh Hóa Province > Thanh Hóa (0.04)
Asia > Vietnam > Hưng Yên Province > Hưng Yên (0.04)
(65 more...)

Genre: Research Report > New Finding (0.66)

Industry: Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension

Ngo, Thinh Phuoc, Dang, Khoa Tran Anh, Luu, Son T., Van Nguyen, Kiet, Nguyen, Ngan Luu-Thuy

arXiv.org Artificial IntelligenceFeb-4-2024

This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for machine reading comprehension tasks. The existing MRC corpora in Vietnamese mainly focus on formal written documents such as Wikipedia articles, online newspapers, or textbooks. In contrast, the VlogQA consists of 10,076 question-answer pairs based on 1,230 transcript documents sourced from YouTube -- an extensive source of user-uploaded content, covering the topics of food and travel. By capturing the spoken language of native Vietnamese speakers in natural settings, an obscure corner overlooked in Vietnamese research, the corpus provides a valuable resource for future research in reading comprehension tasks for the Vietnamese language. Regarding performance evaluation, our deep-learning models achieved the highest F1 score of 75.34% on the test set, indicating significant progress in machine reading comprehension for Vietnamese spoken language data. In terms of EM, the highest score we accomplished is 53.97%, which reflects the challenge in processing spoken-based content and highlights the need for further improvement.

corpus, dataset, transcript, (16 more...)

arXiv.org Artificial Intelligence

2402.02655

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
(7 more...)

Genre: Research Report (0.40)

Industry:

Education > Assessment & Standards > Student Performance (1.00)
Media > News (0.86)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

ViLexNorm: A Lexical Normalization Corpus for Vietnamese Social Media Text

Nguyen, Thanh-Nhi, Le, Thanh-Phong, Van Nguyen, Kiet

arXiv.org Artificial IntelligenceJan-31-2024

Lexical normalization, a fundamental task in Natural Language Processing (NLP), involves the transformation of words into their canonical forms. This process has been proven to benefit various downstream NLP tasks greatly. In this work, we introduce Vietnamese Lexical Normalization (ViLexNorm), the first-ever corpus developed for the Vietnamese lexical normalization task. The corpus comprises over 10,000 pairs of sentences meticulously annotated by human annotators, sourced from public comments on Vietnam's most popular social media platforms. Various methods were used to evaluate our corpus, and the best-performing system achieved a result of 57.74% using the Error Reduction Rate (ERR) metric (van der Goot, 2019a) with the Leave-As-Is (LAI) baseline. For extrinsic evaluation, employing the model trained on ViLexNorm demonstrates the positive impact of the Vietnamese lexical normalization task on other NLP tasks. Our corpus is publicly available exclusively for research purposes.

bartpho syllable, computational linguistic, normalization, (14 more...)

arXiv.org Artificial Intelligence

2401.16403

Country:

Asia > Vietnam > Bình Định Province (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
(15 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

A Novel Explainable Artificial Intelligence Model in Image Classification problem

Cao, Quoc Hung, Nguyen, Truong Thanh Hung, Nguyen, Vo Thanh Khang, Nguyen, Xuan Phong

arXiv.org Artificial IntelligenceJul-9-2023

In recent years, artificial intelligence is increasingly being applied widely in many different fields and has a profound and direct impact on human life. Following this is the need to understand the principles of the model making predictions. Since most of the current high-precision models are black boxes, neither the AI scientist nor the end-user deeply understands what's going on inside these models. Therefore, many algorithms are studied for the purpose of explaining AI models, especially those in the problem of image classification in the field of computer vision such as LIME, CAM, GradCAM. However, these algorithms still have limitations such as LIME's long execution time and CAM's confusing interpretation of concreteness and clarity. Therefore, in this paper, we propose a new method called Segmentation - Class Activation Mapping (SeCAM) that combines the advantages of these algorithms above, while at the same time overcoming their disadvantages. We tested this algorithm with various models, including ResNet50, Inception-v3, VGG16 from ImageNet Large Scale Visual Recognition Challenge (ILSVRC) data set. Outstanding results when the algorithm has met all the requirements for a specific explanation in a remarkably concise time.

artificial intelligence, explanation, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2307.04137

Country:

Asia > Vietnam > Bình Định Province (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.87)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.72)

Add feedback

Eyenuk AI Techlology Selected for Diabetic Eye Testing in Vietnam Supported by The Fred Hollows Foundation

#artificialintelligenceJul-26-2021, 13:45:18 GMT

Eyenuk, Inc., a global artificial intelligence (AI) medical technology and services company and the leader in real-world applications for AI Eye Screening, announced that its EyeArt AI system for diabetic eye testing has been chosen for deployment in 4 hospitals in Binh Dinh Province, Vietnam. The project is funded by The Fred Hollows Foundation. "We are excited to implement the EyeArt AI system to expand our capabilities in detection and treatment of diabetic retinopathy. It will help us reach our goal to protect the vision of approximately six million people with diabetes living in Vietnam," said Pham Quoc Anh, Vietnam Country Manager for The Fred Hollows Foundation. "This important project will help us continue the work first started by Professor Fred Hollows 29 years ago, fulfilling his vision to bring equitable eye health for all."

eyeart ai system, foundation, fred hollows foundation, (7 more...)

#artificialintelligence

Country:

Asia > Vietnam > Bình Định Province (0.26)
Asia > East Asia (0.08)
Oceania > Australia > New South Wales > Sydney (0.06)
(5 more...)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.34)

Add feedback